IN THE CLAIMS

1. (Currently amended) A computer-based method of detecting one or more outliers in a high
dimensional data set of personal attributes, performed on a computer through data mining, the
method comprising the steps of:

determining one or more subsets of dimensions and corresponding ranges in the data set 
which are sparse in density using an algorithm capable of utilizing at least one of the processes of 
solution recombination, selection and mutation over a population of multiple solutions; and 

determining one or more data points in the data set which contain these subsets of dimensions
and corresponding ranges, the one or more data points being identified as the one or more outliers
in the data set.

2. (Original) The method of claim 1, wherein a range is defined as a set of contiguous values
on a given dimension. 
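
As an illustration of claims 1 and 2, the following minimal Python sketch splits each dimension into contiguous ranges and then finds the records that fall inside a chosen combination of ranges. The equi-depth split, the function names `equi_depth_ranges`, `range_index` and `points_in_cube`, and the sample records are assumptions made for illustration; none of them is taken from the claims.

```python
from bisect import bisect_right

def equi_depth_ranges(values, num_ranges):
    """Return the cut points that split `values` into contiguous ranges
    holding roughly equal numbers of points (an assumed discretization)."""
    ordered = sorted(values)
    n = len(ordered)
    # One cut point between each pair of adjacent ranges.
    return [ordered[(i * n) // num_ranges] for i in range(1, num_ranges)]

def range_index(value, cuts):
    """Index of the contiguous range that `value` falls in."""
    return bisect_right(cuts, value)

def points_in_cube(data, cube, cuts_per_dim):
    """Indices of records lying in every (dimension -> range index) pair of `cube`."""
    hits = []
    for row_id, row in enumerate(data):
        if all(range_index(row[d], cuts_per_dim[d]) == r for d, r in cube.items()):
            hits.append(row_id)
    return hits

# Hypothetical records: (age, income in thousands).
data = [(25, 30), (31, 42), (38, 55), (44, 61), (52, 75), (67, 20)]
cuts_per_dim = [equi_depth_ranges([row[d] for row in data], 3) for d in range(2)]
# Cube: highest age range (index 2) combined with lowest income range (index 0).
outlier_ids = points_in_cube(data, {0: 2, 1: 0}, cuts_per_dim)
```

Here a "cube" is simply a mapping from a dimension index to a range index, so a subset of dimensions is expressed by leaving the other dimensions out of the mapping.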

3. (Original) The method of claim 1, wherein the sets of dimensions and corresponding 
ranges in which the data is sparse in density is quantified by a sparsity coefficient measure. 

4. (Original) The method of claim 3, wherein the sparsity coefficient measure S(D) is defined as
$S(D) = \frac{n(D) - N \cdot f^k}{\sqrt{N \cdot f^k \cdot (1 - f^k)}}$, where k represents the number of dimensions in the data set, f represents
the fraction of data points in each range, N is the total number of data points in the data set, and n(D)
is the number of data points in a set of dimensions D.
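
The sparsity coefficient can be evaluated directly from its four inputs. The sketch below is a minimal Python illustration, assuming the measure has the form S(D) = (n(D) - N·f^k) / sqrt(N·f^k·(1 - f^k)); the function name `sparsity_coefficient` and the sample numbers are hypothetical.

```python
import math

def sparsity_coefficient(n_d, total_points, fraction, k):
    """S(D) = (n(D) - N * f**k) / sqrt(N * f**k * (1 - f**k))."""
    expected_fraction = fraction ** k
    # Number of points expected in the region if the data were spread uniformly.
    expected = total_points * expected_fraction
    # Standard deviation of that count under the same assumption.
    std_dev = math.sqrt(total_points * expected_fraction * (1.0 - expected_fraction))
    return (n_d - expected) / std_dev

# Hypothetical numbers: N = 10000 points, f = 0.1 per range, k = 2, n(D) = 40.
s = sparsity_coefficient(n_d=40, total_points=10000, fraction=0.1, k=2)
```

A strongly negative value indicates that the region holds far fewer points than expected, which is the sense in which it is sparse.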

5. (Original) The method of claim 3, wherein a given sparsity coefficient measure is
inversely proportional to the number of data points in a given set of dimensions and corresponding
ranges.

6. (Original) The method of claim 1, wherein a set of dimensions is determined using an
algorithm which uses the processes of solution recombination, selection and mutation over a 
population of multiple solutions. 

7. (Original) The method of claim 6, wherein the process of solution recombination 
comprises combining characteristics of two solutions in order to create two new solutions. 

8. (Original) The method of claim 6, wherein the process of mutation comprises changing 
a particular characteristic of a solution in order to result in a new solution. 

9. (Original) The method of claim 6, wherein the process of selection comprises biasing the 
population in order to favor solutions which are more optimum. 
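
The three processes named in claims 6 through 9 can be sketched as follows. This is a minimal Python illustration under stated assumptions: a solution is encoded as a list with one entry per dimension, holding a range index or `None` when the dimension is not used; one-point crossover and tournament selection are swapped in as concrete choices; and `toy_fitness` is a placeholder rather than the sparsity coefficient. None of these choices is specified by the claims.

```python
import random

def recombine(parent_a, parent_b, rng):
    """One-point crossover: swap the tails of two solutions to create two new ones."""
    point = rng.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(solution, num_ranges, rng):
    """Change one characteristic (one position) of a solution to a different value."""
    child = list(solution)
    pos = rng.randrange(len(child))
    choices = [v for v in [None, *range(num_ranges)] if v != child[pos]]
    child[pos] = rng.choice(choices)
    return child

def select(population, fitness, rng):
    """Tournament selection: bias the next population toward lower (better) fitness."""
    chosen = []
    for _ in population:
        a, b = rng.choice(population), rng.choice(population)
        chosen.append(a if fitness(a) <= fitness(b) else b)
    return chosen

def toy_fitness(solution):
    """Placeholder objective (lower is better): counts unused dimensions."""
    return sum(v is None for v in solution)

rng = random.Random(0)
sol_a = [0, None, 2, None]
sol_b = [None, 1, None, 3]
child_1, child_2 = recombine(sol_a, sol_b, rng)
mutant = mutate(sol_a, num_ranges=4, rng=rng)
next_population = select([sol_a, sol_b, child_1, child_2, mutant], toy_fitness, rng)
```

In a full search these three steps would be repeated over many generations, with the fitness function replaced by a measure of how sparse each candidate region is.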

10. (Currently amended) A computer-based method of detecting one or more outliers in a
high dimensional data set of personal attributes, performed on a computer through data mining, the
method comprising the steps of:

identifying and mining one or more sub-patterns in the data set which have abnormally low 
presence not due to randomness using an algorithm capable of utilizing at least one of the processes
of solution recombination, selection and mutation over a population of multiple solutions; and 

identifying one or more records which have the one or more sub-patterns present in them as 
the one or more outliers. 

11. (Currently amended) Apparatus for detecting one or more outliers in a high dimensional
data set of personal attributes through data mining, comprising:

at least one processor operative to: (i) determine one or more subsets of dimensions and 
corresponding ranges in the data set which are sparse in density using an algorithm capable of 
utilizing at least one of the processes of solution recombination, selection and mutation over a 
population of multiple solutions; and (ii) determine one or more data points in the data set which 
contain these subsets of dimensions and corresponding ranges, the one or more data points being
identified as the one or more outliers in the data set. 

12. (Original) The apparatus of claim 11, wherein a range is defined as a set of contiguous
values on a given dimension.

13. (Original) The apparatus of claim 11, wherein the sets of dimensions and corresponding
ranges in which the data is sparse in density is quantified by a sparsity coefficient measure. 

14. (Original) The apparatus of claim 13, wherein the sparsity coefficient measure S(D) is
defined as $S(D) = \frac{n(D) - N \cdot f^k}{\sqrt{N \cdot f^k \cdot (1 - f^k)}}$, where k represents the number of dimensions in the data set, f
represents the fraction of data points in each range, N is the total number of data points in the data
set, and n(D) is the number of data points in a set of dimensions D.

15. (Original) The apparatus of claim 13, wherein a given sparsity coefficient measure is 
inversely proportional to the number of data points in a given set of dimensions and corresponding
ranges. 

16. (Original) The apparatus of claim 11, wherein a set of dimensions is determined using
an algorithm which uses the processes of solution recombination, selection and mutation over a
population of multiple solutions.

17. (Original) The apparatus of claim 16, wherein the process of solution recombination
comprises combining characteristics of two solutions in order to create two new solutions. 

18. (Original) The apparatus of claim 16, wherein the process of mutation comprises
changing a particular characteristic of a solution in order to result in a new solution.

19. (Original) The apparatus of claim 16, wherein the process of selection comprises biasing
the population in order to favor solutions which are more optimum. 

20. (Currently amended) Apparatus for detecting one or more outliers in a high dimensional
data set of personal attributes through data mining, comprising:

at least one processor operative to: (i) identify and mine one or more sub-patterns in the data
set which have abnormally low presence not due to randomness using an algorithm capable of 
utilizing at least one of the processes of solution recombination, selection and mutation over a 
population of multiple solutions; and (ii) identify one or more records which have the one or more 
sub-patterns present in them as the one or more outliers. 

21. (Currently amended) An article of manufacture for detecting one or more outliers in a
high dimensional data set of personal attributes through data mining, comprising a machine readable
medium containing one or more programs which when executed implement the steps of:

determining one or more subsets of dimensions and corresponding ranges in the data set
which are sparse in density using an algorithm capable of utilizing at least one of the processes of
solution recombination, selection and mutation over a population of multiple solutions; and

determining one or more data points in the data set which contain these subsets of
dimensions and corresponding ranges, the one or more data points being identified as the one or
more outliers in the data set.

22. (Original) The article of claim 21, wherein a range is defined as a set of contiguous 
values on a given dimension. 

23. (Original) The article of claim 21, wherein the sets of dimensions and corresponding 
ranges in which the data is sparse in density is quantified by a sparsity coefficient measure.

24. (Original) The article of claim 23, wherein the sparsity coefficient measure S(D) is
defined as $S(D) = \frac{n(D) - N \cdot f^k}{\sqrt{N \cdot f^k \cdot (1 - f^k)}}$, where k represents the number of dimensions in the data set, f
represents the fraction of data points in each range, N is the total number of data points in the data
set, and n(D) is the number of data points in a set of dimensions D.

25. (Original) The article of claim 23, wherein a given sparsity coefficient measure is
inversely proportional to the number of data points in a given set of dimensions and corresponding 
ranges. 

26. (Original) The article of claim 21, wherein a set of dimensions is determined using an 
algorithm which uses the processes of solution recombination, selection and mutation over a 
population of multiple solutions. 

27. (Original) The article of claim 26, wherein the process of solution recombination 
comprises combining characteristics of two solutions in order to create two new solutions. 

28. (Original) The article of claim 26, wherein the process of mutation comprises changing 
a particular characteristic of a solution in order to result in a new solution. 

29. (Original) The article of claim 26, wherein the process of selection comprises biasing 
the population in order to favor solutions which are more optimum. 

30. (Currently amended) An article of manufacture for detecting one or more outliers in a
high dimensional data set of personal attributes through data mining, comprising a machine readable
medium containing one or more programs which when executed implement the steps of:

identifying and mining one or more sub-patterns in the data set which have abnormally low
presence not due to randomness using an algorithm capable of utilizing at least one of the processes 
of solution recombination, selection and mutation over a population of multiple solutions; and 

identifying one or more records which have the one or more sub-patterns present in them as 
the one or more outliers. 